AITopics | societal bias

Collaborating Authors

societal bias

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LOTUS: A Leaderboard for Detailed Image Captioning from Quality to Societal Bias and User Preferences

Hirota, Yusuke, Li, Boyi, Hachiuma, Ryo, Wu, Yueh-Hua, Ivanovic, Boris, Nakashima, Yuta, Pavone, Marco, Choi, Yejin, Wang, Yu-Chiang Frank, Yang, Chao-Han Huck

arXiv.org Artificial IntelligenceDec-2-2025

Large Vision-Language Models (LVLMs) have transformed image captioning, shifting from concise captions to detailed descriptions. We introduce LOTUS, a leaderboard for evaluating detailed captions, addressing three main gaps in existing evaluations: lack of standardized criteria, bias-aware assessments, and user preference considerations. LOTUS comprehensively evaluates various aspects, including caption quality (e.g., alignment, descriptiveness), risks (\eg, hallucination), and societal biases (e.g., gender bias) while enabling preference-oriented evaluations by tailoring criteria to diverse user preferences. Our analysis of recent LVLMs reveals no single model excels across all criteria, while correlations emerge between caption detail and bias risks. Preference-oriented evaluations demonstrate that optimal model selection depends on user priorities.

caption, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2025.acl-industry.22

2507.19362

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.82)

Add feedback

CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs

Nikeghbal, Nafiseh, Kargaran, Amir Hossein, Diesner, Jana

arXiv.org Artificial IntelligenceOct-14-2025

Improvements in model construction, including fortified safety guardrails, allow Large language models (LLMs) to increasingly pass standard safety checks. However, LLMs sometimes slip into revealing harmful behavior, such as expressing racist viewpoints, during conversations. To analyze this systematically, we introduce CoBia, a suite of lightweight adversarial attacks that allow us to refine the scope of conditions under which LLMs depart from normative or ethical behavior in conversations. CoBia creates a constructed conversation where the model utters a biased claim about a social group. We then evaluate whether the model can recover from the fabricated bias claim and reject biased follow-up questions. We evaluate 11 open-source as well as proprietary LLMs for their outputs related to six socio-demographic categories that are relevant to individual safety and fair treatment, i.e., gender, race, religion, nationality, sex orientation, and others. Our evaluation is based on established LLM-based bias metrics, and we compare the results against human judgments to scope out the LLMs' reliability and alignment. The results suggest that purposefully constructed conversations reliably reveal bias amplification and that LLMs often fail to reject biased follow-up questions during dialogue. This form of stress-testing highlights deeply embedded biases that can be surfaced through interaction. Code and artifacts are available at https://github.com/nafisenik/CoBia.

computational linguistic, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.09871

Country:

Europe (1.00)
North America > United States (0.93)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Debiasing Large Vision-Language Models by Ablating Protected Attribute Representations

Ratzlaff, Neale, Olson, Matthew Lyle, Hinck, Musashi, Tseng, Shao-Yen, Lal, Vasudev, Howard, Phillip

arXiv.org Artificial IntelligenceOct-17-2024

Large Vision Language Models (LVLMs) such as LLaVA have demonstrated impressive capabilities as general-purpose chatbots that can engage in conversations about a provided input image. However, their responses are influenced by societal biases present in their training datasets, leading to undesirable differences in how the model responds when presented with images depicting people of different demographics. In this work, we propose a novel debiasing framework for LVLMs by directly ablating biased attributes during text generation to avoid generating text related to protected attributes, or even representing them internally. Our method requires no training and a relatively small amount of representative biased outputs ( 1000 samples). Our experiments show that not only can we can minimize the propensity of LVLMs to generate text related to protected attributes, but we can even use synthetic data to inform the ablation while retaining captioning performance on real data such as COCO. Furthermore, we find the resulting generations from a debiased LVLM exhibit similar accuracy as a baseline biased model, showing that debiasing effects can be achieved without sacrificing model performance.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.13976

Country:

North America > United States > California > Santa Clara County > Santa Clara (0.04)
Europe > Middle East > Malta > Eastern Region > Northern Harbour District > St. Julian's (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

Add feedback

She Works, He Works: A Curious Exploration of Gender Bias in AI-Generated Imagery

Foka, Amalia

arXiv.org Artificial IntelligenceJul-26-2024

The representation of gender within visual culture has been a fertile ground for critical inquiry, particularly within feminist scholarship. Griselda Pollock's seminal work, Vision and Difference (1988) [2], established a foundational framework for understanding how visual representations of women in art are not merely aesthetic choices, but are deeply intertwined with societal power dynamics and gender ideologies. Pollock's analysis demonstrates how these r epresentations often function as "signs" that reinforce traditional gender roles and limit female agency, inspiring generations of scholars to scrutinize the ways visual culture shapes our understanding of gender and other social identities. This theoretic al framework provides a critical lens through which to examine potential biases in AI -generated art and its impact on contemporary representations of gender. Following Pollock's groundbreaking work, feminist scholarship in visual culture has con nued to evolve and expand.

gender, representa, stereotype, (15 more...)

arXiv.org Artificial Intelligence

2407.18524

Country:

Europe > Greece > Epirus > Ioannina (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)

Add feedback

Disability Representations: Finding Biases in Automatic Image Generation

Tevissen, Yannis

arXiv.org Artificial IntelligenceJun-21-2024

Recent advancements in image generation technology have enabled widespread access to AI-generated imagery, prominently used in advertising, entertainment, and progressively in every form of visual content. However, these technologies often perpetuate societal biases. This study investigates the representation biases in popular image generation models towards people with disabilities (PWD). Through a comprehensive experiment involving several popular text-to-image models, we analyzed the depiction of disability. The results indicate a significant bias, with most generated images portraying disabled individuals as old, sad, and predominantly using manual wheelchairs. These findings highlight the urgent need for more inclusive AI development, ensuring diverse and accurate representation of PWD in generated images. This research underscores the importance of addressing and mitigating biases in AI models to foster equitable and realistic representations.

disability, representation, wheelchair, (14 more...)

arXiv.org Artificial Intelligence

2406.14993

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > France (0.04)

Genre: Research Report (0.41)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Detecting Bias in Large Language Models: Fine-tuned KcBERT

Lee, J. K., Chung, T. M.

arXiv.org Artificial IntelligenceMar-15-2024

The rapid advancement of large language models (LLMs) has enabled natural language processing capabilities similar to those of humans, and LLMs are being widely utilized across various societal domains such as education and healthcare. While the versatility of these models has increased, they have the potential to generate subjective and normative language, leading to discriminatory treatment or outcomes among social groups, especially due to online offensive language. In this paper, we define such harm as societal bias and assess ethnic, gender, and racial biases in a model fine-tuned with Korean comments using Bidirectional Encoder Representations from Transformers (KcBERT) and KOLD data through template-based Masked Language Modeling (MLM). To quantitatively evaluate biases, we employ LPBS and CBS metrics. Compared to KcBERT, the fine-tuned model shows a reduction in ethnic bias but demonstrates significant changes in gender and racial biases. Based on these results, we propose two methods to mitigate societal bias. Firstly, a data balancing approach during the pre-training phase adjusts the uniformity of data by aligning the distribution of the occurrences of specific words and converting surrounding harmful words into non-harmful words. Secondly, during the in-training phase, we apply Debiasing Regularization by adjusting dropout and regularization, confirming a decrease in training loss. Our contribution lies in demonstrating that societal bias exists in Korean language models due to language-dependent characteristics.

arxiv preprint arxiv, probability, social bias, (14 more...)

arXiv.org Artificial Intelligence

2403.10774

Country:

Asia > Afghanistan (0.04)
Europe (0.04)
Asia > South Korea > Gyeonggi-do > Suwon (0.04)
(2 more...)

Genre: Research Report (0.83)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Investigating Bias Representations in Llama 2 Chat via Activation Steering

Lu, Dawn, Rimsky, Nina

arXiv.org Artificial IntelligenceFeb-1-2024

We address the challenge of societal bias in Large Language Models (LLMs), focusing on the Llama 2 7B Chat model. As LLMs are increasingly integrated into decision-making processes with substantial societal impact, it becomes imperative to ensure these models do not reinforce existing biases. Our approach employs activation steering to probe for and mitigate biases related to gender, race, and religion. This method manipulates model activations to direct responses towards or away from biased outputs, utilizing steering vectors derived from the StereoSet dataset and custom GPT4-generated gender bias prompts. Our findings reveal inherent gender bias in Llama 2 7B Chat, persisting even after Reinforcement Learning from Human Feedback (RLHF). We also observe a predictable negative correlation between bias and the model's tendency to refuse responses. Significantly, our study uncovers that RLHF tends to increase the similarity in the model's representation of different forms of societal biases, which raises questions about the model's nuanced understanding of different forms of bias. This work also provides valuable insights into effective red-teaming strategies for LLMs using activation steering, particularly emphasizing the importance of integrating a refusal vector.

llama 2, societal bias, vector, (13 more...)

arXiv.org Artificial Intelligence

2402.00402

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Potential Societal Biases of ChatGPT in Higher Education: A Scoping Review

Li, Ming, Enkhtur, Ariunaa, Yamamoto, Beverley Anne, Cheng, Fei

arXiv.org Artificial IntelligenceNov-24-2023

ChatGPT and other Generative Artificial Intelligence (GAI) models tend to inherit and even amplify prevailing societal biases as they are trained on large amounts of existing data. Given the increasing usage of ChatGPT and other GAI by students, faculty members, and staff in higher education institutions (HEIs), there is an urgent need to examine the ethical issues involved such as its potential biases. In this scoping review, we clarify the ways in which biases related to GAI in higher education settings have been discussed in recent academic publications and identify what type of potential biases are commonly reported in this body of literature. We searched for academic articles written in English, Chinese, and Japanese across four main databases concerned with GAI usage in higher education and bias. Our findings show that while there is an awareness of potential biases around large language models (LLMs) and GAI, the majority of articles touch on ``bias'' at a relatively superficial level. Few identify what types of bias may occur under what circumstances. Neither do they discuss the possible implications for the higher education, staff, faculty members, or students. There is a notable lack of empirical work at this point, and we call for higher education researchers and AI experts to conduct more research in this area.

arxiv preprint arxiv, chatgpt, language model, (14 more...)

arXiv.org Artificial Intelligence

2311.14381

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Oceania > New Zealand (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting > Higher Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

Add feedback

Social Bias Probing: Fairness Benchmarking for Language Models

Manerba, Marta Marchiori, Stańczak, Karolina, Guidotti, Riccardo, Augenstein, Isabelle

arXiv.org Artificial IntelligenceNov-15-2023

Large language models have been shown to encode a variety of social biases, which carries the risk of downstream harms. While the impact of these biases has been recognized, prior methods for bias evaluation have been limited to binary association tests on small datasets, offering a constrained view of the nature of societal biases within language models. In this paper, we propose an original framework for probing language models for societal biases. We collect a probing dataset to analyze language models' general associations, as well as along the axes of societal categories, identities, and stereotypes. To this end, we leverage a novel perplexity-based fairness score. We curate a large-scale benchmarking dataset addressing drawbacks and limitations of existing fairness collections, expanding to a variety of different identities and stereotypes. When comparing our methodology with prior work, we demonstrate that biases within language models are more nuanced than previously acknowledged. In agreement with recent findings, we find that larger model variants exhibit a higher degree of bias. Moreover, we expose how identities expressing different religions lead to the most pronounced disparate treatments across all models.

category, language model, stereotype, (16 more...)

arXiv.org Artificial Intelligence

2311.0909

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Add feedback

Human-AI Interactions and Societal Pitfalls

Castro, Francisco, Gao, Jian, Martin, Sébastien

arXiv.org Artificial IntelligenceOct-12-2023

Generative artificial intelligence (AI) systems, particularly large language models (LLMs), have improved at a rapid pace. For example, ChatGPT recently showcased its advanced capacity to perform complex tasks and human-like behaviors (OpenAI 2023b), reaching 100 million users within two months of its 2022 launch (Hu 2023). This progress is not limited to text generation, as demonstrated by other recent generative AI systems such as Midjourney (Midjourney 2023) (a text-to-image generative AI) and GitHub Copilot (Github 2023) (an AI pair programmer that can autocomplete code). Eloundou et al. (2023) estimated that about 80% of the U.S. workforce could be affected by the introduction of LLMs, and 19% of the workers may have at least 50% of their tasks impacted. In particular, AI can make users more productive by generating complex content in seconds, while users can simply communicate their preferences. For example, Noy and Zhang (2023) highlighted that ChatGPT can substantially improve productivity in writing tasks, and GitHub claims that Copilot increases developer productivity by up to 55% (Kalliamvakou 2023). However, content generated with the help of AI is not exactly the same as content generated without AI. The boost in productivity may come at the expense of users' idiosyncrasies, such as personal style and tastes, preferences we would naturally express without AI. To let users express their preferences, many AI systems let users edit their prompt (e.g., Midjourney) or allow more

information, interaction, lemma ec, (16 more...)

arXiv.org Artificial Intelligence

2309.10448

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Kosovo > District of Gjilan > Kamenica (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Health & Medicine (0.67)
Law > Civil Rights & Constitutional Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Add feedback